Dealing with Data Imbalance in Text Classification

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Dealing with Class Imbalance using Thresholding

We propose thresholding as an approach to deal with class imbalance. We define the concept of thresholding as a process of determining a decision boundary in the presence of a tunable parameter. The threshold is the maximum value of this tunable parameter where the conditions of a certain decision are satisfied. We show that thresholding is applicable not only for linear classifiers but also fo...

متن کامل

Dealing with Concept Drift and Class Imbalance in Multi-Label Stream Classification

Streams of objects that are associated with one or more labels at the same time appear in many applications. However, stream classification of multi-label data is largely unexplored. Existing approaches try to tackle the problem by transferring traditional single-label stream classification practices to the multi-label domain. Nevertheless, they fail to consider some of the unique properties of...

متن کامل

Dealing with Multiple Classes in Online Class Imbalance Learning

Online class imbalance learning deals with data streams having very skewed class distributions in a timely fashion. Although a few methods have been proposed to handle such problems, most of them focus on two-class cases. Multi-class imbalance imposes additional challenges in learning. This paper studies the combined challenges posed by multiclass imbalance and online learning, and aims at a mo...

متن کامل

the clustering and classification data mining techniques in insurance fraud detection:the case of iranian car insurance

با توجه به گسترش روز افزون تقلب در حوزه بیمه به خصوص در بخش بیمه اتومبیل و تبعات منفی آن برای شرکت های بیمه، به کارگیری روش های مناسب و کارآمد به منظور شناسایی و کشف تقلب در این حوزه امری ضروری است. درک الگوی موجود در داده های مربوط به مطالبات گزارش شده گذشته می تواند در کشف واقعی یا غیرواقعی بودن ادعای خسارت، مفید باشد. یکی از متداول ترین و پرکاربردترین راه های کشف الگوی داده ها استفاده از ر...

A Novel Field Learning Algorithm for Dual Imbalance Text Classification

Fish-net algorithm is a novel field learning algorithm which derives classification rules by looking at the range of values of each attribute instead of the individual point values. In this paper, we present a Feature Selection Fish-net learning algorithm to solve the Dual Imbalance problem on text classification. Dual imbalance includes the instance imbalance and feature imbalance. The instanc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Procedia Computer Science

سال: 2019

ISSN: 1877-0509

DOI: 10.1016/j.procs.2019.09.229